Abstract - In this paper we present the Linear Transformation
Algorithm (LTA), which is based on a new transformation, the
Linear Order Transformation (LOT). Experimental results show
that the Linear Transformation Algorithm yields results
comparable to the Burrows-Wheeler Algorithm (BWA) [4] and
outperforms Gzip and the Shorten Waveform Coder for
near-lossless ECG compression; for lossless ECG compression it
yields better compression than all the other techniques.
I. Introduction

Effective compression of electrocardiogram (ECG) signals is
required in many applications, including (a) ECG data
storage, (b) ambulatory recording systems, and (c) ECG data
transmission over a network.

Although lossy compression yields significantly higher
compression ratios while preserving diagnostic accuracy, it is
not usually employed because of legal concerns. Therefore, we
focus on lossless and near-lossless compression of ECG
signals.

Various researchers [8], [9], [10] have investigated
transform-based compression techniques for ECG data.
However, block-sorting techniques have not been fully
investigated.

One of the recent developments in the text compression area
is the Block Sorting Lossless Data Compression Algorithm
(BWA) introduced by Burrows and Wheeler [4]. When
applied to text or image data, BWA achieves better
compression rates than Ziv-Lempel techniques with
comparable speed, while its compression performance is
close to that of context-based methods such as PPM. The
lexical sorting transformation utilized in BWA is called the
Burrows-Wheeler Transformation (BWT).

Cleary et al. [5] viewed BWA (called BW94 by the authors)
as a context-based method with no predetermined upper
bound on context length. Fenwick [6] carried out a
comparative study of BWA and concluded that BWA is a
“viable text compression technique, with a compression
approaching that of the currently best compressors while
being much faster than many other compressors of
comparable performance” [6]. Arnavut and Magliveras [1]
have generalized the idea to permutations and introduced the
Lexical Permutation Sorting Algorithm (LPSA). They have
shown that the BWT is reducible to LPSA, and that LPSA
offers some choices not available with BWA when the
underlying data to be transmitted is a permutation.

Since the introduction of BWA, block-sorting schemes have
attracted great attention in the compression community. In
this work we introduce a different block transformation, the
Linear Order Transformation (LOT), and show that the LOT
transformation is faster than the BWT transformation and
yields better compression than BWA, Gzip and the Shorten
Waveform Coder for ECG data.

II. Linear order transformation

In this work we assume knowledge of some basic
mathematical concepts, including familiarity with elementary
properties of standard objects of discrete mathematics.

For the definitions of permutations and multiset permutations,
the interested reader should consult [10]. Given a multiset
permutation (data string) $\omega = [3,1,3,1,2]$, we construct
a matrix, $M$, by taking consecutive cyclic left-shifts of
$\omega$ as the rows of $M$:

\[
M = \begin{bmatrix}
3 & 1 & 3 & 1 & 2 \\
1 & 3 & 1 & 2 & 3 \\
3 & 1 & 2 & 3 & 1 \\
1 & 2 & 3 & 1 & 3 \\
2 & 3 & 1 & 3 & 1
\end{bmatrix}
\]

By sorting the rows of $M$ lexically we transform it to $M'$,

\[
M' = \begin{bmatrix}
1 & 2 & 3 & 1 & 3 \\
1 & 3 & 1 & 2 & 3 \\
2 & 3 & 1 & 3 & 1 \\
3 & 1 & 2 & 3 & 1 \\
3 & 1 & 3 & 1 & 2
\end{bmatrix}
\]

while by sorting the rows of $M$ linearly (with respect to the
first element in each row), we transform it to

\[
M'' = \begin{bmatrix}
1 & 3 & 1 & 2 & 3 \\
1 & 2 & 3 & 1 & 3 \\
2 & 3 & 1 & 3 & 1 \\
3 & 1 & 3 & 1 & 2 \\
3 & 1 & 2 & 3 & 1
\end{bmatrix}
\]

Hence, we obtain two distinct matrices, $M'$ and $M''$, with
respect to two different orderings. $M'$ and $M''$ have the
same rows, but the ordering of the rows is different. Let the
first column of $M'$ be denoted by $F'$, the second column by
$S'$, and the last column by $L'$. Similarly, let the first
column of $M''$ be denoted by $F''$, the second column by
$S''$, and the last column by $L''$. Notice that $F'$ and $F''$
contain the values of $\omega$ sorted in ascending order, while
the other columns are not sorted.

Burrows and Wheeler [4] showed that, given $L'$ (the last
column of $M'$) and the index of the original multiset
permutation $\omega$ in $M'$, we can recover $\omega$. We now
introduce the Linear Order Transformation technique and show
that it is faster than the Burrows-Wheeler Transformation.
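The two orderings can be reproduced with a short Python sketch. It is only an illustration: the helper names (`cyclic_shifts`, `column`, `M_lex`, `M_lin`) are ours, and it relies on Python's `sorted` being stable, so that sorting on the first symbol alone keeps rows with equal first symbols in their original shift order.

```python
def cyclic_shifts(w):
    """Rows of M: every cyclic left-shift of w, in order of shift amount."""
    return [w[k:] + w[:k] for k in range(len(w))]

def column(matrix, j):
    """Extract column j (negative j counts from the end) as a list."""
    return [row[j] for row in matrix]

omega = [3, 1, 3, 1, 2]
M = cyclic_shifts(omega)

M_lex = sorted(M)                       # M': full lexical order (as in the BWT)
M_lin = sorted(M, key=lambda r: r[0])   # M'': stable sort on the first symbol only

F_lex, S_lex, L_lex = column(M_lex, 0), column(M_lex, 1), column(M_lex, -1)
F_lin, S_lin, L_lin = column(M_lin, 0), column(M_lin, 1), column(M_lin, -1)

print(F_lex == F_lin)   # both first columns are the sorted symbols of omega
print(L_lex)            # last column of M', the BWT output
print(S_lin)            # second column of M'', the LOT output
```

The two matrices share the same rows and the same first column; only the order of rows with equal first symbols differs.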

Definition 2.1 The linearly ordered matrix, $M'' = M''^{m}$, of
a multiset permutation $\omega$ from an underlying set
$X = \{1, 2, \ldots, m\}$ is defined as follows:

1. Let initially $M''^{0} = [\ ]$.

2. $M''^{v} = [M''^{v-1}] * [T_v]$, where $v \in X$,
$T_v = R_1 * R_2 * \cdots * R_w$, and $a * b$ denotes appending
row $b$ to row $a$; $R_j$ is the multiset permutation formed by
cyclically left-shifting $\omega$ by $(k-1)$ positions, where
$k$ is the position (address) of the $j$-th occurrence of
symbol $v$ in $\omega$, and $w$ is the number of occurrences of
$v$ in $\omega$.

To clarify the definition, we give an example. Let
$\omega = [3,1,3,1,2]$ be a multiset permutation from the set
$X = \{1,2,3\}$. Initially, let $M''^{0} = [\ ]$ be empty. The
1's in $\omega$ occur in positions two and four respectively.
Since $\omega$ has two occurrences of $v = 1$,
$M''^{1} = [M''^{0}] * [T_1] = [T_1]$ and $T_1 = R_1 * R_2$. By
cyclically left-shifting $\omega$ by $(2-1) = 1$ position, we
obtain $R_1 = [1,3,1,2,3]$, while by cyclically left-shifting
$\omega$ by $(4-1) = 3$ positions, we obtain
$R_2 = [1,2,3,1,3]$. There is only one 2 in $\omega$ and it
occurs in the fifth position. Cyclically left-shifting $\omega$
by $(5-1) = 4$ positions, we get $T_2 = R_1 = [2,3,1,3,1]$.
Thus $M''^{2} = [M''^{1}] * [T_2]$. There are two 3's in
$\omega$ and they occur at positions one and three
respectively. Hence, $T_3 = R_1 * R_2$; by cyclically
left-shifting $\omega$ by $(1-1) = 0$ positions we obtain
$R_1 = [3,1,3,1,2]$, and by cyclically left-shifting $\omega$
by $(3-1) = 2$ positions we obtain $R_2 = [3,1,2,3,1]$.
Therefore, $M'' = M''^{3} = [M''^{2}] * [T_3]$.
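The construction in Definition 2.1 can be transcribed almost literally. The following Python sketch is an illustration under our own naming (`linearly_ordered_matrix` is not from the paper); it scans the data once per symbol, so it mirrors the definition rather than the linear-time procedure.

```python
def linearly_ordered_matrix(w, m):
    """Build M'' for a multiset permutation w over X = {1, ..., m}."""
    rows = []                                     # M''^0 = [ ]
    for v in range(1, m + 1):                     # v runs over X in increasing order
        for k, symbol in enumerate(w, start=1):   # k = 1-based position in w
            if symbol == v:
                shift = k - 1                     # left-shift by (k - 1) positions
                rows.append(w[shift:] + w[:shift])  # next row R_j of T_v
    return rows

for row in linearly_ordered_matrix([3, 1, 3, 1, 2], 3):
    print(row)
```

For the example string this prints the two shifts that start with 1, then the single shift that starts with 2, then the two shifts that start with 3, each group in order of occurrence.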


An obvious observation about the linearly ordered matrix $M''$
is this: for any two two-tuples $(F''_i, S''_i)$ and
$(F''_k, S''_k)$ from the first two columns of $M''$ with
$F''_i = F''_k$, the pair $(F''_i, S''_i)$ appears earlier than
the pair $(F''_k, S''_k)$ in $M''$ (scanning from top to
bottom) if and only if it appears earlier in $\omega$
(scanning from left to right). Clearly, the linear ordering
induces a particular order on the pairs of elements
$(F'', S'')$. Because a particular ordering is induced, we can
always recover $\omega$ uniquely if the row index of $\omega$
in $M''$ and the second column $S''$ of $M''$ are known. In our
example, $\omega = [3,1,3,1,2]$ occurs at position 4 and the
second column is $S'' = [3,2,3,1,1]$. Assume that both the row
index of $\omega$ and $S''$ are transmitted to a receiver. Upon
receiving $S''$, the receiver obtains the frequencies of the
elements in $S''$ by using counting sort [6]. Once the
frequencies of the distinct elements in $S''$ are known, the
receiver constructs $F'' = [1,1,2,3,3]$ and has the first two
columns $(F'', S'')$ of $M''$,

\[
(F'', S'') = \begin{bmatrix}
1 & 1 & 2 & 3 & 3 \\
3 & 2 & 3 & 1 & 1
\end{bmatrix}^{T}
\]

Accessing the fourth position of $(F'', S'')$, the receiver
acquires the first two elements of $\omega$,
$(F''_4, S''_4) = [3,1]$. By marking the fourth entry, the
receiver eliminates it from further consideration. The
receiver must then determine what follows $S''_4 = 1$ in
$\omega$ to find the rest of the elements of $\omega$. To
discover the third element of $\omega$, the receiver scans
$F''$ from top to bottom to find the first unused (unmarked)
entry that has value 1. In our example, this is the first
entry, where $F''_1 = S''_4 = 1$. Hence, $S''_1 = 3$ follows
$S''_4 = 1$ in $\omega$. The receiver then eliminates the first
entry from further consideration. Since $S''_1 = 3$ is
determined, the process is repeated to get the fourth element
of $\omega$. Again, the receiver scans $F''$ to find the first
unused entry that contains $S''_1 = 3$. The fourth entry is
already marked, so the first unused entry with value 3 is at
position five in $F''$, and the receiver discovers that the
fourth element of $\omega$ is $S''_5 = 1$. Again, this entry is
eliminated from further consideration. Finally, to find the
fifth element of $\omega$, the receiver scans for the first
unused entry that has value 1 in $F''$. In our example, because
the first entry with value 1 was used previously, the second
entry is considered, $F''_2 = 1$. Therefore, the last element
of $\omega$ is $S''_2 = 2$, and $\omega = [3,1,3,1,2]$.
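The receiver's procedure can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function name `inverse_lot`, the zero-based row index, and the per-symbol pointer table are our assumptions. `sorted` stands in for the counting sort mentioned in the text, and the example recomputes $S''$ and the row index from the shifts of $\omega$ rather than hard-coding them.

```python
def inverse_lot(S, index):
    """Recover the data from S'' and the 0-based row index of the data in M''."""
    n = len(S)
    F = sorted(S)                      # F'': the symbols of S'' in ascending order
    ptr = {}
    for i, v in enumerate(F):
        ptr.setdefault(v, i)           # first row whose F'' entry equals v
    used = [False] * n
    w = [F[index]]                     # first element of the data
    row = index
    for _ in range(n - 1):
        used[row] = True               # mark the entry just consumed
        v = S[row]                     # the element that follows in the data
        w.append(v)
        p = ptr[v]
        while used[p]:                 # first unused entry of F'' with value v
            p += 1
        ptr[v] = p
        row = p
    return w

omega = [3, 1, 3, 1, 2]
shifts = [omega[k:] + omega[:k] for k in range(len(omega))]
M_lin = sorted(shifts, key=lambda r: r[0])   # stable sort gives M''
S = [r[1] for r in M_lin]
index = M_lin.index(omega)
print(inverse_lot(S, index) == omega)
```

Because each pointer only moves forward, the scan for the first unused entry costs linear time overall once $F''$ is available.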

The transformation described above is called the Linear Order
Transformation (LOT). The LOT transformation requires $O(2n)$
time. Let $f_j$ be the frequency of symbol $j$ in a given data
stream. With one pass over the data, the frequencies
$(f_1, f_2, \ldots, f_m)$ of the different symbols can be
discovered. Hence, for each different symbol $v$ in the data, a
starting pointer (address) $P_v$ is determined. For example,
for symbol $i$, the starting address initially would be
$P_i = f_1 + f_2 + \cdots + f_{i-1}$.
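As a worked instance of the pointer formula, take the example string $[3,1,3,1,2]$ used earlier. The offsets below are zero-based, which is our reading of the formula (the sum for the smallest symbol is empty):

\[
f_1 = 2,\quad f_2 = 1,\quad f_3 = 2
\quad\Longrightarrow\quad
P_1 = 0,\quad P_2 = f_1 = 2,\quad P_3 = f_1 + f_2 = 3 .
\]

Under this reading, the entries written for symbols 1, 2 and 3 occupy offsets 0-1, 2 and 3-4 of the output, respectively.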


Fig. 1. Lossless compression of ECG files.

Fig. 2. ECG files with loss value ±1.

Fig. 3. ECG files with loss value ±3.

Fig. 4. ECG files with loss value ±5.

Using those pointers and scanning the data from left to right,
for each $v$ in the data we write the value of the element that
immediately follows $v$ (its right neighbor, taken cyclically)
to the location pointed to by the pointer of $v$. We then
advance the pointer. Clearly, this operation constructs $S''$
in $O(2n)$ time. The BWT transformation requires
$O(n \log n)$ time to obtain $L'$ because of the lexical
sorting [4]. Hence, LOT is faster than BWT. To reconstruct the
original data of size $n$ from $S''$, LOT requires $O(2n)$
time, which is also the time required by BWT to reconstruct the
data from $L'$. When LOT is followed by the MTF [3] and
Run-Length coders, we call the technique the Linear
Transformation Algorithm (LTA), by analogy with BWA.
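The two-pass procedure can be sketched in Python as below. It is an illustrative sketch under our own assumptions: the name `forward_lot`, the zero-based offsets, and the use of `collections.Counter` are not from the paper. Sorting the distinct symbols costs time in the alphabet size only; for a fixed alphabet such as 10-bit samples a plain frequency array could be used instead.

```python
from collections import Counter

def forward_lot(w):
    """Return (S'', index): the second column of M'' and the 0-based row of w."""
    n = len(w)
    freq = Counter(w)                  # pass 1: symbol frequencies f_v
    ptr, total = {}, 0
    for v in sorted(freq):             # P_v = sum of f_u over symbols u < v
        ptr[v] = total
        total += freq[v]
    index = ptr[w[0]]                  # w starts with the first occurrence of w[0],
                                       # so it is the first row of that symbol's block
    S = [None] * n
    for k, v in enumerate(w):          # pass 2: scatter each right neighbor
        S[ptr[v]] = w[(k + 1) % n]     # cyclic successor of the k-th element
        ptr[v] += 1                    # advance the pointer of v
    return S, index

print(forward_lot([3, 1, 3, 1, 2]))
```

No row of the matrix is ever materialized: each input element triggers exactly one write and one pointer update.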

III.  Experimental  results 

Figures 1-4 show the compressed sizes of 22 different ECG
files under four different techniques. All the files employed
in this experiment were obtained from Prof. Memon. Each ECG
file is 12000 bytes in size, and each ECG sample in the file is
recorded with 10 bits. Since each sample is recorded with
10 bits, the BWA and LTA algorithms were modified
accordingly. The Gzip and Shorten Waveform Coder programs,
however, were obtained from public sites and used without any
modification.

Figure 1 indicates that the LTA scheme yields a better
compression gain than the other three techniques (BWA, Gzip
and Shorten) when the files are compressed without any loss.
Observe that when the data files to be compressed have loss
value ±1 (Figure 2), LTA and BWA yield almost the same
compression gain. As can be seen in Figures 3 and 4, when the
degree of loss increases, the BWA algorithm performs better
than the other techniques.


IV.  Conclusion 


In this work, by expanding the theoretical foundations of the
Lexical Permutation Sorting Algorithm [1], we introduced a
new block-sorting technique, the Linear Order Transformation,
and showed that LOT is faster than the BWT transformation.

We have shown that when the data are transformed with LOT
followed by MTF and Run-Length coding, the compression gain
obtained is better than that of the recently introduced BWA
and of other well-known compression schemes, such as Gzip and
the Shorten Waveform Coder, for lossless ECG data. We have
also shown that block-transformed data yield better
compression than Gzip and the Shorten Waveform Coder in both
the lossless and near-lossless cases.

Our future work will involve extending this work to compare
the results of all the transform-based coding schemes, such as
the ones reported in [8], [9], [10]. However, taking the
results of Gzip as a baseline, we believe that block-sorting
transforms yield better compression gain than the other
transform-based coders for ECG data.